Spring 2026: Math 291 Daily Update

Tuesday, January 20

After reviewing class procedures and items on the syllabus, we began a discussion of \(2\times 2\) matrices over \(\mathbb{R}\), the set of which we denote by \(\textrm{M}_2(\mathbb{R})\). Given \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), we identified the entries of \(A\) as follows: \(a\) is the (1,1) entry, \(b\) is the (1,2) entry, \(c\) is the (2,1) entry and \(d\) is the (2,2) entry.

We established two fundamental operations:

  1. (i) Matrix addition: Given \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), \(B = \begin{pmatrix} e & f\\g & h\end{pmatrix}\), define \(A+B := \begin{pmatrix} a+e & b+f\\c+g & d+h\end{pmatrix}\).
  2. (ii) Scalar multiplication: Given \(\lambda \in \mathbb{R}\) and \(A\in \textrm{M}_2(\mathbb{R})\), define \(\lambda \cdot A = \begin{pmatrix} \lambda a & \lambda b\\\lambda c & \lambda d\end{pmatrix}\).
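The two operations are easy to encode; a minimal Python sketch (matrices as nested lists, helper names ours; an illustration, not part of the lecture):

```python
# A 2x2 matrix over R, represented as a nested list [[a, b], [c, d]].

def mat_add(A, B):
    """(i) Matrix addition: entrywise sum."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scalar_mul(lam, A):
    """(ii) Scalar multiplication: multiply every entry by lam."""
    return [[lam * A[i][j] for j in range(2)] for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_add(A, B))     # [[6, 8], [10, 12]]
print(scalar_mul(2, A))  # [[2, 4], [6, 8]]
```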

We then discussed at length the following properties. In what follows, \(A, B, C\in \textrm{M}_2(\mathbb{R})\) and \(\lambda, \lambda_1, \lambda_2 \in \mathbb{R}\).

1. An additive identity exists: For \({\bf 0}_{2\times 2} := \begin{pmatrix} 0 & 0\\0 & 0\end{pmatrix}\), we have \({\bf 0}_{2\times 2}+A = A\), for all \(A\in \textrm{M}_2(\mathbb{R})\).

2. Additive inverses exist: Given \(-A := \begin{pmatrix} -a & -b\\-c & -d\end{pmatrix}\), we have \(-A+A = {\bf 0}_{2\times 2}\).

3. Addition is commutative: \(A+B = B+A\).

4. Addition is associative: \((A+B)+C = A+(B+C)\), for \(A, B, C\in \textrm{M}_2(\mathbb{R})\).

5. Scalar multiplication distributes over matrix addition: \(\lambda \cdot (A+B) = \lambda \cdot A+\lambda \cdot B\), for \(\lambda \in \mathbb{R}\).

6. Scalar addition is distributive: \((\lambda_1+\lambda_2)\cdot A = \lambda_1\cdot A+\lambda_2\cdot A\), for \(\lambda_1,\lambda_2\in \mathbb{R}\).

7. Scalar multiplication is associative: \((\lambda_1\lambda_2)\cdot A = \lambda _1\cdot (\lambda_2\cdot A)\).

8. \(1\cdot A = A\) and \(0\cdot A = {\bf 0}_{2\times 2}\).

We also discussed how one might prove these identities and noted that the properties above will recur throughout the semester as we discuss abstract vector spaces. We ended class by discussing the following consequences of properties (1)-(8) above. Keeping the same notation, we have

  1. (i) \(-1\cdot A = -A\).
  2. (ii) Additive inverses are unique, i.e., if \(A+C = {\bf 0}_{2\times 2}\), then \(C = -A\).
  3. (iii) Cancellation holds for matrix addition, i.e., if \(A+B = A+C\), then \(B = C\).
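These consequences can be spot-checked numerically for a sample matrix (a sketch with the two operations encoded as Python helpers; a check, not a proof):

```python
def mat_add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scalar_mul(lam, A):
    """Multiply every entry of A by the scalar lam."""
    return [[lam * A[i][j] for j in range(2)] for i in range(2)]

ZERO = [[0, 0], [0, 0]]
A = [[3, -1], [2, 5]]

# (i)  (-1) * A is the additive inverse -A
neg_A = scalar_mul(-1, A)
assert mat_add(neg_A, A) == ZERO

# (ii) a C with A + C = 0 agrees with -A (checked for this A)
C = [[-3, 1], [-2, -5]]
assert mat_add(A, C) == ZERO and C == neg_A
```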

Thursday, January 22

In today's lecture we introduced two new operations for \(2\times 2\) matrices, namely multiplication of a matrix by a column vector and multiplication of two \(2\times 2\) matrices.

For \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), \(B = \begin{pmatrix} e & f\\g & h\end{pmatrix}\), \(C = \begin{pmatrix} u\\v\end{pmatrix}\), we defined

  1. (i) \(A\cdot C = \begin{pmatrix} a & b\\c & d\end{pmatrix} \cdot \begin{pmatrix} u\\v\end{pmatrix} := \begin{pmatrix} au+bv\\cu+dv\end{pmatrix}\).
  2. (ii) \(A\cdot B = \begin{pmatrix} a & b\\c & d\end{pmatrix}\cdot \begin{pmatrix} e & f\\g & h\end{pmatrix} = \begin{pmatrix} ae+bg & af+bh\\ce+dg & cf+dh\end{pmatrix}\).

We noted that the (1,1) entry of \(AB\) is \(R_1\cdot C_1\), the (1,2) entry is \(R_1\cdot C_2\), the (2,1) entry is \(R_2\cdot C_1\) and the (2,2) entry is \(R_2\cdot C_2\), where \(R_i\) is the \(i\)th row of \(A\) and \(C_j\) is the \(j\)th column of \(B\). We also noted that if we think of \(B\) as the matrix with columns \(C_1, C_2\), i.e., \(B = [C_1\ C_2]\), then \(AB = [AC_1\ AC_2]\), the matrix with columns \(AC_1, AC_2\).
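Both products, and the column-by-column description of \(AB\), can be sketched in Python (helper names ours):

```python
def mat_vec(A, v):
    """Product of a 2x2 matrix with a column vector (u, v)."""
    return [A[0][0]*v[0] + A[0][1]*v[1],
            A[1][0]*v[0] + A[1][1]*v[1]]

def mat_mul(A, B):
    """(i,j) entry is row i of A times column j of B."""
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
AB = mat_mul(A, B)
print(AB)                # [[19, 22], [43, 50]]

# Column view: AB = [A C1  A C2], where C1, C2 are the columns of B.
C1 = [B[0][0], B[1][0]]
C2 = [B[0][1], B[1][1]]
assert [AB[0][0], AB[1][0]] == mat_vec(A, C1)
assert [AB[0][1], AB[1][1]] == mat_vec(A, C2)
```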

We discussed how we can use the product of a matrix with a column vector to rewrite a system of equations as a single matrix equation, as follows. Given the system of two equations in two unknowns

\[\begin{align*} ax+by &= u\\ cx+dy &= v, \end{align*}\]

we can write this as \(AX = L\), where \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), \(X = \begin{pmatrix} x\\y\end{pmatrix}\) and \(L = \begin{pmatrix} u\\v\end{pmatrix}\).

We then discussed powers of \(2\times 2\) matrices and the class calculated \(A^2, A^3, A^4\) and conjectured the value of \(A^{2026}\), for \(A = \begin{pmatrix} 1 & 0\\1 & 1\end{pmatrix}\). We were easily able to conjecture \(A^n = \begin{pmatrix} 1 & 0\\n & 1\end{pmatrix}\), for all \(n\geq 1\), which led to a discussion of how to use mathematical induction to prove this fact. One first establishes the base case \(n = 1\), which in this case is clear. One then shows that the \(n-1\) case implies the \(n\)th case (the inductive step), which in this case amounted to showing that \(A\cdot \begin{pmatrix} 1 & 0\\n-1 & 1\end{pmatrix} = \begin{pmatrix} 1 & 0\\n & 1\end{pmatrix} = A^n\). The class then used induction to prove the formula \(1+2+\cdots + n = \frac{n(n+1)}{2}\).
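The induction is the proof, but the conjecture \(A^n = \begin{pmatrix} 1 & 0\\n & 1\end{pmatrix}\) can also be machine-checked up to \(n = 2026\) (a sanity check, not a proof):

```python
def mat_mul(A, B):
    """Row-times-column product of 2x2 matrices."""
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 0], [1, 1]]
power = A                                # A^1
for n in range(1, 2027):
    assert power == [[1, 0], [n, 1]]     # the conjectured formula for A^n
    power = mat_mul(power, A)
print("A^n = [[1, 0], [n, 1]] holds for n = 1, ..., 2026")
```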

We moved on to discuss (but not prove) the following

Properties of matrix multiplication. Let \(A, B, C\) be \(2\times 2\) matrices.

  1. (i) \({\bf 0}_{2\times 2}\cdot A = {\bf 0}_{2\times 2} = A\cdot {\bf 0}_{2\times 2}\).
  2. (ii) For \(I_2 := \begin{pmatrix} 1 & 0\\0 & 1\end{pmatrix}\), \(A\cdot I_2 = A = I_2\cdot A\), i.e., a multiplicative identity exists.
  3. (iii) Multiplication distributes over matrix sums: \(A\cdot (B+C) = A\cdot B+A\cdot C\).
  4. (iv) Multiplication is associative: \(A(BC) = (AB)C\).
  5. (v) A matrix \(D\) satisfying \(AD = I_2 = DA\) is called an inverse of \(A\) and is denoted \(A^{-1}\).

We finished class by noting that if the matrix equation \(AX = L\) represents a system of equations (as above) and \(A\) has an inverse, then we can multiply both sides of the matrix equation by \(A^{-1}\) to get the solution \(X = A^{-1}L\).

Tuesday, January 27

We began class with the following definition. For the \(2\times 2\) matrix \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\), the determinant of \(A\), denoted \(\det A\), equals \(ad-bc\).

We then discussed and verified the following

Properties of the determinant. Let \(A, B\) denote \(2\times 2\) matrices over \(\mathbb{R}\).

  1. (i) If \(A'\) is obtained from \(A\) by multiplying a row or column of \(A\) by \(\lambda \in \mathbb{R}\), then \(\det A' = \lambda \det A\).
  2. (ii) If \(A'\) is obtained from \(A\) by interchanging its rows or interchanging its columns, then \(\det A' = -\det A\).
  3. (iii) If \(A'\) is obtained from \(A\) by adding a multiple of one of its rows to another row, then \(\det A = \det A'\). Similarly for the columns of \(A\).

The operations (i)-(iii) are called elementary row or column operations.

  1. (iv) \(\det AB = \det A\cdot \det B\).
  2. (v) Suppose \(\det A \not = 0\). Set \(\Delta := \det A\). Then \(A^{-1}\) exists and we have \(A^{-1} = \begin{pmatrix} \frac{d}{\Delta} & -\frac{b}{\Delta}\\-\frac{c}{\Delta} & \frac{a}{\Delta}\end{pmatrix}\).
  3. (vi) Given vectors \(u = (a,b)\) and \(v = (c,d)\), the area of the parallelogram in \(\mathbb{R}^2\) spanned by \(u\) and \(v\) is \(|ad-bc|\), i.e., the absolute value of \(\det \begin{pmatrix} a & b\\c & d\end{pmatrix}\).
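Properties (iv) and (v) are quick to confirm on sample matrices (a numerical spot check, not a proof; helper names ours):

```python
def det(A):
    """Determinant ad - bc of a 2x2 matrix [[a, b], [c, d]]."""
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

def mat_mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [5, 3]]
B = [[4, 7], [1, 2]]

# (iv) the determinant is multiplicative
assert det(mat_mul(A, B)) == det(A) * det(B)

# (v) the inverse formula: (1/Delta) * [[d, -b], [-c, a]]
delta = det(A)
A_inv = [[A[1][1]/delta, -A[0][1]/delta],
         [-A[1][0]/delta, A[0][0]/delta]]
assert mat_mul(A, A_inv) == [[1, 0], [0, 1]]
```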

We ended class by looking at a typical system of two linear equations in two unknowns

\[\begin{align*} (1)\quad ax+by &= u\\ (2)\quad cx+dy &= v. \end{align*}\]

We noted that each equation corresponds to a straight line \(L_1, L_2\) (respectively) in \(\mathbb{R}^2\) and \((s,t)\) is a solution to the system if and only if \((s,t)\) is a point on each line. Thus the following are the only possibilities for the solution set to the given system of equations:

  1. (i) There is a unique solution. This occurs when \(L_1\) and \(L_2\) are not parallel, and thus intersect in a single point.
  2. (ii) There is no solution. This occurs when \(L_1\) and \(L_2\) are parallel.
  3. (iii) There are infinitely many solutions. This occurs when \(L_1 = L_2\), so that \((s,t)\) is a solution to the system if and only if it is a solution to the first (or second) equation.

Thus, there can never be a \(2\times 2\) system of linear equations with exactly 17 solutions! (Or with exactly \(n\) solutions for any \(n > 1\).)

Thursday, January 29

In the previous lecture we saw that given a system of linear equations

\[\begin{align*} ax+by &= u\\ cx+dy &= v \end{align*}\]

whose coefficient matrix \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\) has non-zero determinant \(\Delta\), the solution to the system is given by

Cramer's Rule. For the system above, with \(\Delta \not = 0\), \(x = \frac{\det\begin{pmatrix} u & b\\v & d\end{pmatrix}}{\Delta}\) and \(y = \frac{\det\begin{pmatrix} a & u\\c & v\end{pmatrix}}{\Delta}\).
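Cramer's rule takes only a few lines; a sketch (function names ours), applied to the system \(2x+6y = 8,\ 3x+y = 4\) treated below:

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a*d - b*c

def cramer(a, b, c, d, u, v):
    """Solve ax+by = u, cx+dy = v, assuming the determinant is non-zero."""
    delta = det2(a, b, c, d)
    x = det2(u, b, v, d) / delta
    y = det2(a, u, c, v) / delta
    return x, y

# The system 2x+6y = 8, 3x+y = 4 has solution x = 1, y = 1.
print(cramer(2, 6, 3, 1, 8, 4))   # (1.0, 1.0)
```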

We noted that for large systems of linear equations, Cramer's rule is not cost-effective, so we began a discussion of Gaussian elimination. We started with a specific system of equations, namely

\[\begin{align*} 2x+6y &= 8\\ 3x+y &= 4 \end{align*}\]

and performed a sequence of operations that changed the system, but preserved the solution. These operations were of the following form: Interchange equations, add a multiple of one equation to another equation, and multiply an equation by a non-zero number. This simplified the system to one trivially solvable, namely

\[\begin{align*} x &= 1\\ y &= 1. \end{align*}\]

We noted that in doing the various operations, the arithmetic involved the coefficients in the equations and the variables were essentially placeholders. This led to considering the corresponding augmented matrix \(\begin{bmatrix}2 & 6 & | & 8\\3 & 1 & | & 4\end{bmatrix}\). By performing the same operations on the rows of the augmented matrix that we did on the system of equations, this led to the augmented matrix \(\begin{bmatrix}1 & 0 & | & 1\\0 & 1 & | & 1\end{bmatrix}\), which corresponds to the system

\[\begin{align*} x &= 1\\ y &= 1. \end{align*}\]
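The row operations just performed can be replayed literally (a sketch; floating-point arithmetic is fine here):

```python
# Rows of the augmented matrix [2 6 | 8; 3 1 | 4], each as [coeff, coeff, rhs].
R1 = [2, 6, 8]
R2 = [3, 1, 4]

def scale(row, lam):
    """Multiply a row by a non-zero number."""
    return [lam * x for x in row]

def add_rows(row_a, row_b):
    """Entrywise sum of two rows."""
    return [x + y for x, y in zip(row_a, row_b)]

R1 = scale(R1, 1/2)               # leading 1 in row 1:  [1, 3, 4]
R2 = add_rows(R2, scale(R1, -3))  # clear below it:      [0, -8, -8]
R2 = scale(R2, -1/8)              # leading 1 in row 2:  [0, 1, 1]
R1 = add_rows(R1, scale(R2, -3))  # clear above it:      [1, 0, 1]
assert R1 == [1, 0, 1] and R2 == [0, 1, 1]   # i.e. x = 1, y = 1
```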

We formalized this process by defining

Elementary Row Operations. Let \(A\) be a \(2\times 2\) matrix (or any matrix in fact). The following constitute elementary row operations:

  1. (i) Interchange two rows.
  2. (ii) Add a multiple of one row to another row.
  3. (iii) Multiply a row by a non-zero number.

We noted that the goal was to put, if possible, the beginning augmented matrix \(\begin{bmatrix}a & b & | & u\\c & d & | & v\end{bmatrix}\) into the form \(\begin{bmatrix}1 & 0 & | & s\\0 & 1 & | & t\end{bmatrix}\), from which the solution \(x = s, y = t\) could be read. We noted that the strategy for the Gaussian elimination process should be as follows: Using elementary row operations, first get a 1 in the (1,1) entry of the augmented matrix, then use that 1 to get a 0 below it. Then make, if possible, the (2,2) entry of the augmented matrix 1 and then use that 1 to get a 0 above it. This will always be possible when the original system has a unique solution.

We ended class by considering the remaining two cases. In one case, the final augmented matrix took the form \(\begin{bmatrix}1 & 3 & | & 4\\0 & 0 & | & 0\end{bmatrix}\), which corresponds to the system \(x+3y = 4\), from which one concludes \(x = 4-3y\). To describe the solution set we introduced another parameter \(t\) to get \(\{(4-3t, t)\ |\ t\in \mathbb{R}\}\) as the solution set. This is the case when the system has infinitely many solutions. We then saw an example where the final augmented matrix took the form \(\begin{bmatrix}1 & 3 & | & 4\\0 & 0 & | & 1\end{bmatrix}\), which meant the original system had no solution, since \(0 = 1\) is a contradiction.

Tuesday, February 3

After reviewing the possible outcomes for solving a system of two linear equations in two unknowns using Gaussian elimination, we considered the following \(2\times 3\) system of equations

\[\begin{align*} 2x+5y+2z &= 9\\ x+2y-z &= 4. \end{align*}\]

As expected, we began with the augmented matrix \(\begin{bmatrix}2 & 5 & 2 & | & 9\\1 & 2 & -1 & | & 4\end{bmatrix}\), applied elementary row operations with the same strategy as in the \(2\times 2\) case and arrived at \(\begin{bmatrix}1 & 0 & -9 & | & 2\\0 & 1 & 4 & | & 1\end{bmatrix}\), from which we derived the solution set \(\{(2+9t,1-4t,t)\ |\ t\in \mathbb{R}\}\), which is a one parameter solution set.

We then considered the system

\[\begin{align*} 2x+4y-2z &= 8\\ x+2y-z &= 4. \end{align*}\]

Following the same procedure led to the augmented matrix \(\begin{bmatrix}1 & 2 & -1 & | & 4\\0 & 0 & 0 & | & 0\end{bmatrix}\), showing that the solution set is \(\{(4-2t_1+t_2, t_1, t_2)\ |\ t_1, t_2\in \mathbb{R}\}\), a two parameter solution set. We then recorded the fact that if we start with a system of two linear equations in three unknowns, the original augmented matrix \(\begin{bmatrix}a & b & c & | & u\\d & e & f & | & v\end{bmatrix}\) takes one of the following three forms after performing elementary row operations:

Case A:
\[\begin{bmatrix}1 & 0 & * & | & *\\0 & 1 & * & | & *\end{bmatrix}\]
Case B:
\[\begin{bmatrix}1 & * & * & | & *\\0 & 0 & 0 & | & 0\end{bmatrix}\]
Case C:
\[\begin{bmatrix}1 & * & * & | & *\\0 & 0 & 0 & | & \alpha\end{bmatrix}\]

where \(\alpha \neq 0\) in case C. We noted that in case A the solution set is infinite, described by one parameter; in case B it is infinite, described by two parameters; and in case C there is no solution. In particular, no \(2\times 3\) system of linear equations has a unique solution. We further noted that in all cases we have seen, there is the

Important fact. The number of independent parameters needed to describe the solution set is the number of variables minus the number of leading ones that appear in the final augmented matrix.

After noting that this important fact applies to systems of equations of any size, we then formalized the form the final matrix should take in the Gaussian elimination process. This applies to systems of linear equations of any size.

Reduced Row Echelon Form. An augmented matrix is in reduced row echelon form (RREF) if it satisfies the following conditions:

  1. (i) The leading entry of each non-zero row is 1.
  2. (ii) There are only zeros above and below each non-zero leading entry.
  3. (iii) The non-zero leading entries move to the right as one moves down the rows.
  4. (iv) All zero rows are at the bottom of the matrix.

We ended class by noting that the technique of Gaussian elimination, namely converting the system to an augmented matrix and applying elementary row operations to get a RREF, applies to systems of linear equations of any size and we illustrated this by solving the system

\[\begin{align*} x+y+z &= 6\\ 2x+4y+6z &= 28\\ 5x+7y+9z &= 46 \end{align*}\]

whose augmented matrix row-reduces to \(\begin{bmatrix}1 & 0 & -1 & | & -2\\0 & 1 & 2 & | & 8\\0 & 0 & 0 & | & 0\end{bmatrix}\) (the third equation is the second plus three times the first, so a zero row appears), giving the one parameter solution set \(\{(-2+t,\ 8-2t,\ t)\ |\ t\in \mathbb{R}\}\); taking \(t = 3\) gives the particular solution \(x = 1, y = 2, z = 3\).
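This final example can be checked with a small RREF routine (a minimal sketch using exact `Fraction` arithmetic; the helper name `rref` is ours). Note that the third equation is the second plus three times the first, so the reduction produces a zero row and a one parameter family containing \((1,2,3)\):

```python
from fractions import Fraction

def rref(M):
    """Row-reduce an augmented matrix (list of rows) to reduced row echelon form."""
    M = [[Fraction(x) for x in row] for row in M]
    pivot_row = 0
    for col in range(len(M[0]) - 1):          # skip the right-hand-side column
        # find a row at or below pivot_row with a non-zero entry in this column
        hit = next((r for r in range(pivot_row, len(M)) if M[r][col] != 0), None)
        if hit is None:
            continue
        M[pivot_row], M[hit] = M[hit], M[pivot_row]        # (i)   interchange
        piv = M[pivot_row][col]
        M[pivot_row] = [x / piv for x in M[pivot_row]]     # (iii) make a leading 1
        for r in range(len(M)):
            if r != pivot_row and M[r][col] != 0:          # (ii)  clear the column
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return M

M = rref([[1, 1, 1, 6], [2, 4, 6, 28], [5, 7, 9, 46]])
assert M == [[1, 0, -1, -2], [0, 1, 2, 8], [0, 0, 0, 0]]
```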

Thursday, February 5

We began class by reviewing the possible outcomes in terms of the RREFs for an augmented matrix representing a \(3\times 3\) system of linear equations. We then reminded the class of the important fact that given any system of linear equations, the number of independent parameters needed to describe the solution set is the number of variables minus the number of leading 1s in the RREF of the corresponding augmented matrix. We also noted that a system of linear equations is said to be homogeneous if the right hand side of the system consists entirely of zeros. In this case, there will always be at least one solution, namely the solution in which all of the given variables equal zero.

We then noted the following

General fact. Suppose \(A\) is an \(n\times m\) matrix and \(B\) is a \(p\times t\) matrix. Then we can only form the product \(AB\) when \(m = p\). In this case \(AB\) is an \(n\times t\) matrix whose \((i,j)\) entry is the \(i\)th row of \(A\) times the \(j\)th column of \(B\).

After computing a couple of products, we noted that the product rules (i)-(v) from January 22 hold, as long as the product exists. We then began a discussion of elementary \(2\times 2\) matrices.

Elementary \(2\times 2\) matrices. The elementary matrices are obtained by applying elementary row operations to \(I_2 = \begin{pmatrix} 1 & 0\\0 & 1\end{pmatrix}\).

  1. (i) Type I: Interchange the rows of \(I_2\), i.e., \(E = \begin{pmatrix} 0 & 1\\1 & 0\end{pmatrix}\).
  2. (ii) Type II: Add a multiple of one row of \(I_2\) to another, e.g., \(E = \begin{pmatrix} 1 & 0\\ \lambda & 1\end{pmatrix}\) or \(E = \begin{pmatrix} 1 & \lambda\\0 & 1\end{pmatrix}\), for \(\lambda \in \mathbb{R}\).
  3. (iii) Type III: Multiply a row of \(I_2\) by a non-zero number, e.g., \(E = \begin{pmatrix} \lambda & 0\\0 & 1\end{pmatrix}\) or \(E = \begin{pmatrix} 1 & 0\\0 & \lambda\end{pmatrix}\).

The class then verified (in the \(2\times 2\) case) that for any elementary matrix \(E\), \(EA\) is the same matrix obtained by applying the corresponding elementary row operation to \(A\). We then noted that elementary matrices are invertible, and their inverses are elementary matrices that can be easily guessed.

We then asked what it means, in terms of elementary matrices, if \(A\) is a \(2\times 2\) matrix that reduces to \(I_2\) via elementary row operations. We inferred that there exists a \(2\times 2\) matrix \(H\) such that \(HA = I_2\) and thus (via the bonus problem listed in today's homework) \(A\) is invertible and \(A^{-1}\) can be found via Gaussian elimination, as stated below.

Using Gaussian elimination to find \(A^{-1}\), if it exists. Given an \(n\times n\) matrix \(A\), start with the \(n\times (2n)\) augmented matrix \([A\ |\ I_n]\) and apply elementary row operations until either:

  1. (i) One arrives at \([I_n\ | \ B]\), in which case \(B = A^{-1}\) or
  2. (ii) At some point the left hand side of the augmented matrix has a row consisting entirely of zeros, in which case \(A\) does not have an inverse.
In terms of elementary matrices, if \(E_1, \ldots, E_r\) are the elementary matrices corresponding to the row operations used in (i), then \(A^{-1} = E_r \cdots E_2 E_1\).
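The \([A\ |\ I_n]\) procedure is easy to carry out mechanically in the \(2\times 2\) case; a minimal sketch (the function name is ours; exact `Fraction` arithmetic avoids rounding):

```python
from fractions import Fraction

def invert_2x2(A):
    """Row-reduce [A | I_2]; return the right half, or None if a zero row appears."""
    M = [[Fraction(A[0][0]), Fraction(A[0][1]), 1, 0],
         [Fraction(A[1][0]), Fraction(A[1][1]), 0, 1]]
    if M[0][0] == 0:
        M[0], M[1] = M[1], M[0]                              # interchange rows
    if M[0][0] == 0:
        return None                                          # left column is zero
    M[0] = [x / M[0][0] for x in M[0]]                       # leading 1 in row 1
    M[1] = [x - M[1][0] * y for x, y in zip(M[1], M[0])]     # clear below it
    if M[1][1] == 0:
        return None                                          # zero row on the left
    M[1] = [x / M[1][1] for x in M[1]]                       # leading 1 in row 2
    M[0] = [x - M[0][1] * y for x, y in zip(M[0], M[1])]     # clear above it
    return [M[0][2:], M[1][2:]]

print(invert_2x2([[2, 6], [3, 1]]))
# [[Fraction(-1, 16), Fraction(3, 8)], [Fraction(3, 16), Fraction(-1, 8)]]
```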

Tuesday, February 10

We began class by thinking of \(\mathbb{R}^2\) as a vector space even though we have not yet formally defined the general concept of a vector space. We noted that the elements of \(\mathbb{R}^2\) can be thought of as row or column vectors and as such one can add two vectors or scalar multiply a vector by a real number exactly as one does in Calculus II. We then noted that the elements of \(\mathbb{R}^2\) satisfy the Fundamental Properties from the lecture of January 20. We then pointed out (informally) that it is precisely these properties that make any set with addition and scalar multiplication into a vector space. We then gave the following

Definition-Proposition. Take \(v_1, v_2\in \mathbb{R}^2\). Then \(v_1, v_2\) are said to be linearly independent if the following equivalent statements hold:

  1. (i) If \(\alpha v_1+\beta v_2 = \vec{0}\), for \(\alpha,\beta \in \mathbb{R}\), then \(\alpha = 0 = \beta\).
  2. (ii) We cannot write \(v_1 = \lambda v_2\) or \(v_2 = \lambda v_1\), for any \(\lambda \in \mathbb{R}\); that is, neither vector is a scalar multiple of the other.

We noted that geometrically, condition (ii) just says that \(v_1\) and \(v_2\) do not lie on the same line through the origin in \(\mathbb{R}^2\). We also noted that if one of \(v_1, v_2\) is \(\vec{0}\), then \(v_1, v_2\) cannot be linearly independent. We also defined \(v_1, v_2\) to be linearly dependent if they are not linearly independent. We noted that the vectors \((1,2), (2,1)\) are linearly independent, while the vectors \((1,2), (4,8)\) are linearly dependent.

We then proved the following important theorem.

Theorem. Take vectors \(v_1 = (a,b), v_2 = (c,d) \in \mathbb{R}^2\) and set \(A := \begin{pmatrix} a & b\\c & d\end{pmatrix}\). Then \(v_1, v_2\) are linearly independent if and only if \(\det A \neq 0\).

We ended class by noting, but not verifying, that if \(v_1, v_2 \in \mathbb{R}^2\) are linearly independent, then given any vector \(u\in \mathbb{R}^2\), we can find (unique) \(\alpha, \beta \in \mathbb{R}\) such that \(u = \alpha v_1+\beta v_2\).
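The closing claim can be made concrete: for linearly independent \(v_1, v_2\), the coefficients \(\alpha, \beta\) come from solving a \(2\times 2\) system. A sketch using the independent pair \((1,2), (2,1)\) from earlier (helper names ours; \(u = (4,5)\) is an arbitrary test vector):

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a*d - b*c

def coordinates(v1, v2, u):
    """Solve u = alpha*v1 + beta*v2 by Cramer's rule; needs v1, v2 independent."""
    delta = det2(v1[0], v2[0], v1[1], v2[1])
    assert delta != 0, "v1, v2 must be linearly independent"
    alpha = det2(u[0], v2[0], u[1], v2[1]) / delta
    beta = det2(v1[0], u[0], v1[1], u[1]) / delta
    return alpha, beta

# (1,2) and (2,1) are the independent pair from the lecture; u is arbitrary.
alpha, beta = coordinates((1, 2), (2, 1), (4, 5))
print(alpha, beta)   # 2.0 1.0
```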

Thursday, February 12

We began class by reviewing the definition of what it means for two vectors in \(\mathbb{R}^2\) to be linearly independent. This led to a discussion and example of how any vector in \(\mathbb{R}^2\) is a linear combination of a fixed set of linearly independent vectors. In particular, we showed how \((6,9)\) can be written as a linear combination of the independent vectors \((2,3)\) and \((1,3)\). We then formally gave the following definitions.

Definitions. Given \(v_1, v_2\in \mathbb{R}^2\):

  1. (i) A linear combination of \(v_1, v_2\) is an expression of the form \(\alpha v_1+\beta v_2\), with \(\alpha, \beta \in \mathbb{R}\).
  2. (ii) The span of \(v_1, v_2\), denoted \(\langle v_1, v_2\rangle\), is the set of all possible linear combinations of \(v_1, v_2\).

We then established the following theorem:

Theorem. Given vectors \(v_1, v_2\in \mathbb{R}^2\), \(v_1, v_2\) are linearly independent if and only if \(\langle v_1, v_2\rangle = \mathbb{R}^2\).

Important facts. The equivalence in the theorem above is special to taking two vectors in \(\mathbb{R}^2\). In \(\mathbb{R}^3\), two independent vectors never span \(\mathbb{R}^3\) and four spanning vectors are never linearly independent. However, in \(\mathbb{R}^3\), three vectors are linearly independent if and only if they span \(\mathbb{R}^3\). We will discuss the general notions of linear independence and spanning in the near future.

We ended class by observing that a line through the origin in \(\mathbb{R}^2\) is closed under vector addition and scalar multiplication, and noted that these properties will determine the concept of subspace to be discussed next week.

Tuesday, February 17

We began class with the following important definition.

Definition. A function \(T: \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation if:

  1. (i) \(T(v_1+v_2) = T(v_1)+T(v_2)\), for all \(v_1, v_2\in \mathbb{R}^2\)
  2. (ii) \(T(\lambda v) = \lambda T(v)\), for all \(\lambda \in \mathbb{R}\) and \(v\in \mathbb{R}^2\).

Examples of linear transformations given in class were:

  1. (i) \(T(x,y) = (2x+2y, -6x+2y)\);
  2. (ii) \(T(x,y) = (-y,x)\), rotation 90 degrees counterclockwise;
  3. (iii) \(T(x,y) = (y,-x+2y)\).

We noted that the second linear transformation has the property that no line through the origin is mapped to itself by the linear transformation (i.e., \(T\) has no eigenvectors), while the third linear transformation maps the line \(y = x\) to itself (i.e., the vector \((1,1)\) is an eigenvector).

We then noted that a linear transformation \(T: \mathbb{R}^2\to \mathbb{R}^2\) is totally determined by its effect on the standard basis of \(\mathbb{R}^2\) given by \(E = \{e_1, e_2\}\), where \(e_1 = (1,0)\) and \(e_2 = (0,1)\). In fact, if \(v = (a,b)\), then

\[T(v) = T(a,b) = T((a,0)+(0,b)) = T(a,0)+T(0,b) = aT(1,0)+bT(0,1) = aT(e_1)+bT(e_2),\]

showing that the value of \(T(v)\) is already determined by the values of \(T(e_1)\) and \(T(e_2)\).

We then gave the following definition, which we noted is a special case of a more general definition to come. We first reminded the class of the theorem from the previous lecture which showed that \(v_1, v_2\in \mathbb{R}^2\) are linearly independent if and only if they span \(\mathbb{R}^2\).

Definition. Two vectors \(v_1, v_2\in \mathbb{R}^2\) form a basis for \(\mathbb{R}^2\) if they span \(\mathbb{R}^2\), or equivalently, they are linearly independent.

After noting that the standard basis for \(\mathbb{R}^2\) mentioned above is a basis for \(\mathbb{R}^2\), we gave the following important definition.

Definition. Let \(T: \mathbb{R}^2\to \mathbb{R}^2\) be a linear transformation, \(\alpha := \{v_1, v_2\}\) and \(\beta := \{w_1, w_2\}\) be bases for \(\mathbb{R}^2\). Then the matrix of T with respect to \(\alpha\) and \(\beta\) is the matrix \([T]_{\alpha}^{\beta} = \begin{pmatrix} a & c\\b & d\end{pmatrix}\), where \(T(v_1) = aw_1+bw_2\) and \(T(v_2) = cw_1+dw_2\).

We then noted that \([T]^E_E\) is the easiest matrix to calculate, namely if \(T(x,y) = (ax+cy, bx+dy)\), then \([T]_E^E = \begin{pmatrix} a & c\\b & d\end{pmatrix}\). From this we noted that if \(v = (e,f)\), then in terms of columns, \(T(v) = T(e,f) = A\begin{pmatrix} e\\f\end{pmatrix}\), where \(A = [T]_E^E\). This equation always holds true as long as we take the standard basis for both the input basis and output basis.

We finished class by noting that if \(T(x,y) = (2x-4y, x+7y)\), then \([T]_E^E = \begin{pmatrix} 2 & -4\\1 & 7\end{pmatrix}\), while if \(B = \{w_1, w_2\}\) with \(w_1 = (1,1)\) and \(w_2 = (1,2)\), \([T]_B^E = \begin{pmatrix} -2 & -6\\8 & 15\end{pmatrix}\), which follows easily from the fact that \(T(1,1) = (-2,8)\) and \(T(1,2) = (-6,15)\). However, to calculate \([T]_E^B\), we saw that this required solving two systems of equations, namely

\[r\begin{pmatrix} 1\\1\end{pmatrix} + s\begin{pmatrix} 1\\2\end{pmatrix} = \begin{pmatrix} 2\\1\end{pmatrix}\quad\quad \textrm{and}\quad\quad u\begin{pmatrix} 1\\1\end{pmatrix} + v\begin{pmatrix} 1\\2\end{pmatrix} = \begin{pmatrix} -4\\7\end{pmatrix}.\]
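Solving those two systems (our computation; the lecture stopped at setting them up) gives \(r = 3, s = -1\) and \(u = -15, v = 11\), so \([T]_E^B = \begin{pmatrix} 3 & -15\\-1 & 11\end{pmatrix}\). A quick check by Cramer's rule:

```python
def solve2(a, b, c, d, u, v):
    """Solve a r + b s = u, c r + d s = v by Cramer's rule (det must be non-zero)."""
    delta = a*d - b*c
    return (u*d - b*v) / delta, (a*v - u*c) / delta

# Coordinates of T(e1) = (2,1) and T(e2) = (-4,7) w.r.t. w1 = (1,1), w2 = (1,2).
r, s = solve2(1, 1, 1, 2, 2, 1)
u, v = solve2(1, 1, 1, 2, -4, 7)
print([[r, u], [s, v]])   # [[3.0, -15.0], [-1.0, 11.0]], the matrix [T]_E^B
```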

Thursday, February 19

After recalling the definitions of a basis for \(\mathbb{R}^2\), of a linear transformation, and of the matrix of a linear transformation with respect to two bases, we considered the following question: Suppose \(S, T: \mathbb{R}^2\to \mathbb{R}^2\) are two linear transformations. Then the composition \(ST:\mathbb{R}^2\to \mathbb{R}^2\) is also a linear transformation. Can a matrix of \(ST\) be expressed in terms of matrices for \(S\) and \(T\)?

Very Important Formula. Suppose \(S, T: \mathbb{R}^2\to \mathbb{R}^2\) are linear transformations, and \(\alpha, \beta, \gamma\) are bases for \(\mathbb{R}^2\). Then \(ST: \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation and

\[[ST]_{\alpha}^{\gamma} = [S]_{\beta}^{\gamma}\cdot [T]_{\alpha}^{\beta}.\]

After proving this formula, we described change of basis matrices, by noting that if \(\alpha, \beta\) are bases for \(\mathbb{R}^2\), then \([I_2]_{\alpha}^{\beta}\) is the matrix obtained by expressing the basis elements of \(\alpha\) in terms of the basis \(\beta\). Similarly, \([I_2]_{\beta}^{\alpha}\) is the matrix obtained by expressing the basis elements of \(\beta\) in terms of \(\alpha\). Using the very important formula above we saw that the matrices are inverses of one another:

\[I_2 = [I_2]_{\beta}^{\beta} = [I_2]_{\alpha}^{\beta} \cdot[I_2]_{\beta}^{\alpha}\quad \textrm{and}\quad I_2 = [I_2]_{\alpha}^{\alpha} = [I_2]_{\beta}^{\alpha} \cdot[I_2]_{\alpha}^{\beta}.\]

This led to the fundamental formula, which follows from the very important formula.

Change of Basis Formula. Suppose \(T: \mathbb{R}^2\to \mathbb{R}^2\) is a linear transformation and \(\alpha, \beta\) are bases for \(\mathbb{R}^2\). Then

\[[T]_{\beta}^{\beta} = [I_2]_{\alpha}^{\beta}\cdot [T]_{\alpha}^{\alpha}\cdot [I_2]_{\beta}^{\alpha}.\]

In particular, if we set \(A := [T]_{\alpha}^{\alpha}\) , \(B := [T]_{\beta}^{\beta}\) and \(P:= [I_2]_{\beta}^{\alpha}\), we obtain the classic expression

\[B = P^{-1}AP.\]

Tuesday, February 24

The class worked on practice problems for Exam I.

Thursday, February 26

Exam 1.

Tuesday, March 3

After making a few comments about Exam 1, we looked at the following

Example A. For the matrix \(A = \begin{pmatrix} 0 & -2\\1 & 3\end{pmatrix}\), find a non-zero column vector \(v\in \mathbb{R}^2\) and \(\lambda \in \mathbb{R}\) such that \(Av = \lambda v\).

We approached this example by setting \(v = \begin{pmatrix} x\\y\end{pmatrix}\) and setting up a system of equations that ultimately became the matrix equation \(\begin{pmatrix} -\lambda & -2\\1 & -\lambda+3\end{pmatrix} \begin{pmatrix} x\\y\end{pmatrix} = \begin{pmatrix} 0\\0\end{pmatrix}\). Since the vector \(v\) should be non-zero, the matrix equation should have a non-trivial solution, which happens when \(\det \begin{pmatrix} -\lambda & -2\\1 & -\lambda+3\end{pmatrix} = 0\). This led to the polynomial equation \(\lambda^2-3\lambda+2 = 0\), which has solutions \(\lambda = 2\) and \(\lambda = 1\). These are the values of \(\lambda\) we seek. What about the corresponding vectors? We saw that when \(\lambda = 2\), the matrix equation becomes \(\begin{pmatrix} -2 & -2\\1 & 1\end{pmatrix} \begin{pmatrix} x\\y\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}\), which is easily seen to have a non-trivial solution \(v_1 = \begin{pmatrix} 1\\-1\end{pmatrix}\). Similarly, we saw that when \(\lambda = 1\), the matrix equation has coefficient matrix \(\begin{pmatrix} -1 & -2\\1 & 2\end{pmatrix}\), so \(v_2 = \begin{pmatrix} 2\\-1\end{pmatrix}\) is a non-trivial solution. Thus, we found two vectors \(v_1, v_2\) such that \(Av_1 = 2\cdot v_1\) and \(Av_2 = 1\cdot v_2\). We also noted that any non-zero multiple of \(v_1\) works for \(\lambda = 2\) and any non-zero multiple of \(v_2\) works for \(\lambda = 1\).

We then noted that \(v_1, v_2\) form a basis for \(\mathbb{R}^2\), since the determinant of \(P := \begin{pmatrix} 1 & 2\\-1 & -1\end{pmatrix}\) is not zero. We then saw that \(P^{-1} = \begin{pmatrix} -1 & -2\\1 & 1\end{pmatrix}\) and that if we set \(B = \begin{pmatrix} 2 & 0\\0 & 1\end{pmatrix}\), then \(B = P^{-1}AP\). Thus, we saw

Summary of Example A. First, there exist non-zero vectors \(v_1, v_2 \in \mathbb{R}^2\) with \(Av_1 = 2\cdot v_1\) and \(Av_2 = 1\cdot v_2\). The values 2 and 1 were found by solving the equation obtained by setting \(\det \begin{pmatrix}-\lambda & -2\\1 & -\lambda +3\end{pmatrix} = 0\); and second, if we take \(P\) to be the matrix with columns \(v_1, v_2\), then \(P^{-1}AP = \begin{pmatrix} 2 & 0\\0 & 1\end{pmatrix}\).
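Example A can be verified end-to-end with hand-rolled \(2\times 2\) helpers (a sketch, not part of the lecture):

```python
def mat_mul(A, B):
    """Row-times-column product of 2x2 matrices."""
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(A, v):
    """Product of a 2x2 matrix with a column vector."""
    return [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]

A = [[0, -2], [1, 3]]
v1, v2 = [1, -1], [2, -1]

assert mat_vec(A, v1) == [2, -2]    # A v1 = 2 v1
assert mat_vec(A, v2) == [2, -1]    # A v2 = 1 v2

P = [[1, 2], [-1, -1]]              # columns v1, v2
P_inv = [[-1, -2], [1, 1]]          # the inverse found in class
assert mat_mul(P_inv, mat_mul(A, P)) == [[2, 0], [0, 1]]
```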

This led to the following

Definitions. Suppose \(A\in \textrm{M}_2(\mathbb{R})\).

  1. (i) \(\lambda \in \mathbb{R}\) is an eigenvalue of \(A\) if there exists a non-zero vector \(v\in \mathbb{R}^2\) such that \(Av = \lambda v\). Any non-zero vector \(v\) satisfying \(Av = \lambda v\) is an eigenvector of \(A\) associated to \(\lambda\).
  2. (ii) \(A\) is said to be diagonalizable if there exists an invertible matrix \(P\in \textrm{M}_2(\mathbb{R})\) such that \(P^{-1}AP\) is a diagonal matrix, i.e., \(P^{-1}AP = \begin{pmatrix} \lambda_1 & 0\\0 & \lambda_2\end{pmatrix}\), for \(\lambda_1, \lambda_2\in \mathbb{R}\).

This led to formalizing the process of finding eigenvalues.

Proposition-Definition. Let \(A\in \textrm{M}_2(\mathbb{R})\). The eigenvalues of \(A\) are found by solving the polynomial equation \(\det (A-\lambda I_2) = 0\). The resulting polynomial \(\det (A-\lambda I_2)\) is called the characteristic polynomial of \(A\). If \(\lambda\) is an eigenvalue, then eigenvectors corresponding to \(\lambda\) are obtained by taking non-zero solutions to the system of equations \((A-\lambda I_2)\begin{pmatrix} x\\y\end{pmatrix} = \begin{pmatrix} 0\\0\end{pmatrix}\).

We ended class by repeating the steps in Example A for the matrix \(B = \begin{pmatrix} -1 & 2\\4 & -3\end{pmatrix}\).
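Repeating the computation for \(B\) (the answers are not recorded above, so the values here are our own check): \(\det(B - xI_2) = x^2+4x-5 = (x-1)(x+5)\), giving eigenvalues \(1\) and \(-5\), with eigenvectors \((1,1)\) and \((1,-2)\) respectively:

```python
def mat_vec(M, v):
    """Product of a 2x2 matrix with a column vector."""
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

B = [[-1, 2], [4, -3]]

# Characteristic polynomial: x^2 - (trace)x + det = x^2 + 4x - 5
trace = B[0][0] + B[1][1]
det = B[0][0]*B[1][1] - B[0][1]*B[1][0]
assert (trace, det) == (-4, -5)

assert mat_vec(B, [1, 1]) == [1, 1]       # B(1,1)  =  1 * (1,1)
assert mat_vec(B, [1, -2]) == [-5, 10]    # B(1,-2) = -5 * (1,-2)
```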

Thursday, March 5

We began class by reviewing the following for \(A \in \textrm{M}_2(\mathbb{R})\):

  1. (i) \(\lambda \in \mathbb{R}\) is an eigenvalue of \(A\) if \(Av = \lambda v\), for some non-zero vector \(v\in \mathbb{R}^2\).
  2. (ii) Any non-zero vector satisfying \(Av = \lambda v\) is an eigenvector of \(A\) associated to the eigenvalue \(\lambda\).
  3. (iii) The eigenvalues are found by solving the equation \(\det (A-\lambda I_2) = 0\) for \(\lambda\). In other words, \(\alpha\) is an eigenvalue of \(A\) if and only if \(\alpha\) is a root of the polynomial \(p_A(x) := \det (A-xI_2)\). The polynomial \(p_A(x)\) is called the characteristic polynomial of \(A\).
  4. (iv) If \(\alpha\) is an eigenvalue of \(A\), the corresponding eigenvectors are the vectors \(v\in \mathbb{R}^2\) solving the matrix equation \((A-\alpha I_2)\cdot v = \vec{0}\), or equivalently, the ordered pairs that are solutions to the system of equations \[\begin{aligned}(-\alpha +a)x+by &=0\\ cx+(-\alpha + d)y &= 0,\end{aligned}\] where \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\).
  5. (v) \(A\) is said to be diagonalizable if there exists invertible \(P\in \textrm{M}_2(\mathbb{R})\) such that \(P^{-1}AP = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), for some \(\alpha, \beta \in \mathbb{R}\).

We then considered the following

Example. Taking \(A = \begin{pmatrix} 2 & 1\\1 & 2\end{pmatrix}\), we found the eigenvalues 1, 3 with corresponding eigenvectors \(v_1 = \begin{pmatrix} 1\\-1\end{pmatrix}\) and \(v_2 = \begin{pmatrix} 1\\1\end{pmatrix}\). We then showed that for \(P = \begin{pmatrix} 1 & 1\\-1 & 1\end{pmatrix}\), \(P^{-1}AP = \begin{pmatrix} 1 & 0\\0 & 3\end{pmatrix}\).
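This diagonalization can be verified directly. The NumPy sketch below (an illustration added to these notes, not from class) computes \(P^{-1}AP\) for the matrices in the example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of P are the eigenvectors v1 = (1, -1) and v2 = (1, 1).
P = np.array([[1.0, 1.0],
              [-1.0, 1.0]])

D = np.linalg.inv(P) @ A @ P
print(D)  # the diagonal matrix diag(1, 3)
assert np.allclose(D, np.diag([1.0, 3.0]))
```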

We then observed that the characteristic polynomial of the matrix \(A = \begin{pmatrix} 0 & -1\\1 & 0\end{pmatrix}\) is \(x^2+1\), so \(A\) does not have any (real) eigenvalues. This was explained geometrically, since multiplication of a vector in \(\mathbb{R}^2\) by \(A\) rotates the vector 90 degrees counter-clockwise, so that no vector in \(\mathbb{R}^2\) is mapped to a multiple of itself.
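Numerically, the absence of real eigenvalues shows up as follows: `numpy.linalg.eigvals` returns the complex roots \(\pm i\) of \(x^2+1\), neither of which is real (an illustrative check, not from class).

```python
import numpy as np

# Counter-clockwise rotation by 90 degrees.
A = np.array([[0.0, -1.0],
              [1.0, 0.0]])

eigenvalues = np.linalg.eigvals(A)
print(eigenvalues)  # purely imaginary: i and -i, so A has no real eigenvalues
```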

We then stated the following

Key Observations, which make theoretical discussions easier. Let \(H\) be a \(2\times 2\) matrix over \(\mathbb{R}\).

  1. (i) For \(w = \begin{pmatrix} \alpha\\\beta\end{pmatrix} \in \mathbb{R}^2\), \(Hw = \alpha C_1+\beta C_2\), where \(C_1, C_2\) are the columns of \(H\).
  2. (ii) Let \(L = [D_1\ D_2]\), i.e., \(L\) is a \(2\times 2\) matrix whose columns are \(D_1, D_2\). Then \(HL = [HD_1\ HD_2]\).
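Both observations are mechanical computations, and a quick NumPy sketch makes them concrete (the matrices below are chosen arbitrarily, purely for illustration):

```python
import numpy as np

H = np.array([[1.0, 2.0],
              [3.0, 4.0]])
C1, C2 = H[:, 0], H[:, 1]  # the columns of H

# (i) H w is the linear combination alpha*C1 + beta*C2.
alpha, beta = 5.0, -2.0
w = np.array([alpha, beta])
assert np.allclose(H @ w, alpha * C1 + beta * C2)

# (ii) The columns of HL are H applied to the columns of L.
L = np.array([[0.0, 1.0],
              [2.0, 3.0]])
D1, D2 = L[:, 0], L[:, 1]
assert np.allclose((H @ L)[:, 0], H @ D1)
assert np.allclose((H @ L)[:, 1], H @ D2)
```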

We ended class by stating, but not proving, the following theorem, which formalizes the processes used in our previous examples.

Diagonalizability Theorem. Let \(A\) be a \(2\times 2\) matrix with entries in \(\mathbb{R}\).

  1. (i) Suppose \(A\) has two linearly independent eigenvectors \(v_1, v_2\), so that \(v_1, v_2\) form a basis for \(\mathbb{R}^2\). Then \(A\) is diagonalizable. More explicitly, if \(Av_1 = \alpha v_1\) and \(Av_2 = \beta v_2\), then \(P^{-1}AP = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), where \(P\) is the \(2\times 2\) matrix whose columns are \(v_1\) and \(v_2\).
  2. (ii) Suppose \(A\) is diagonalizable, i.e., there exists an invertible \(2\times 2\) matrix \(P\) such that \(P^{-1}AP = \begin{pmatrix} \alpha & 0\\0 & \beta\end{pmatrix}\), for \(\alpha, \beta \in \mathbb{R}\). Then, \(\alpha, \beta\) are the eigenvalues of \(A\), and if \(v_1, v_2\) are the columns of \(P\), \(Av_1 = \alpha v_1\) and \(Av_2 = \beta v_2\). In particular, \(\mathbb{R}^2\) has a basis consisting of eigenvectors of \(A\), since the columns of \(P\) form a basis for \(\mathbb{R}^2\).

Thus, \(A\) is diagonalizable if and only if \(\mathbb{R}^2\) has a basis consisting of eigenvectors of \(A\).

Tuesday, March 10

We began class by re-stating the Diagonalizability Theorem from the end of the previous lecture. With the theorem in mind, we then considered the following three scenarios.

Three possibilities. Suppose \(A \in \mathrm{M}_2(\mathbb{R})\), so that \(p_A(x)\) is a degree two polynomial with real coefficients. One of the following scenarios holds:

  1. (i) \(p_A(x)\) has two distinct (real) roots.
  2. (ii) \(p_A(x)\) has a repeated root, i.e., \(p_A(x) = (x-\lambda)^2\), for some \(\lambda \in \mathbb{R}\).
  3. (iii) \(p_A(x)\) has no real roots.

We noted that we have previously seen examples of types (i) and (iii). We then noted that for \(A = \begin{pmatrix} 2 & 7\\0 & 2\end{pmatrix}\), \(p_A(x) = (x-2)^2\) and that all eigenvectors are multiples of \(v = \begin{pmatrix} 1\\0\end{pmatrix}\), so that by the theorem, \(A\) is not diagonalizable.
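To see this concretely: \(A - 2I_2 = \begin{pmatrix} 0 & 7\\0 & 0\end{pmatrix}\) has rank 1, so the eigenspace for \(\lambda = 2\) is one-dimensional, spanned by \(v\). The NumPy sketch below (an illustrative check, not from class) confirms this:

```python
import numpy as np

A = np.array([[2.0, 7.0],
              [0.0, 2.0]])

M = A - 2.0 * np.eye(2)  # A - 2I = [[0, 7], [0, 0]]

# Rank 1 means the eigenspace for lambda = 2 is only one-dimensional,
# so R^2 has no basis consisting of eigenvectors of A.
assert np.linalg.matrix_rank(M) == 1

v = np.array([1.0, 0.0])
assert np.allclose(M @ v, 0.0)  # v = (1, 0) spans the eigenspace
```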

We then gave a proof of the theorem characterizing diagonalizability. The proof of the theorem relied heavily on the two Key Observations from the previous lecture. We finished class by presenting the following very important corollary to the diagonalizability theorem.

Corollary. Suppose \(A\in \mathrm{M}_2(\mathbb{R})\) has two distinct eigenvalues. Then \(A\) is diagonalizable.

Thursday, March 12

Most of the class was spent discussing how diagonalizability of a \(2\times 2\) matrix can be used to solve a coupled system of two linear first-order differential equations. In particular, given the system

\[\begin{align*} x_1'(t) &= ax_1(t)+bx_2(t)\\ x_2'(t) &= cx_1(t)+dx_2(t), \end{align*}\]

assuming the coefficient matrix \(A = \begin{pmatrix} a & b\\c & d\end{pmatrix}\) is diagonalizable, we worked through the derivation of the solution to the system and found that the solution took the form

\[\begin{pmatrix} x_1(t)\\x_2(t)\end{pmatrix} = c_1e^{\alpha t}v_1+c_2e^{\beta t}v_2,\]

where \(\alpha, \beta\) are the eigenvalues of \(A\), with corresponding eigenvectors \(v_1, v_2\in \mathbb{R}^2\), and where \(c_1, c_2\in \mathbb{R}\) are determined by the initial conditions \(\begin{pmatrix} x_1(0)\\x_2(0)\end{pmatrix} = P\cdot \begin{pmatrix} c_1\\c_2\end{pmatrix}\), for \(P = [v_1\ v_2]\).

We finished class by using the derivation above to solve the system

\[\begin{align*} x_1'(t) &= 2x_1(t)+x_2(t)\\ x_2'(t) &= x_1(t)+2x_2(t) \end{align*}\]

with initial conditions \(x_1(0) = 3, x_2(0) = -4\).
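The arithmetic in this final example can be checked numerically. Here the coefficient matrix is the \(A = \begin{pmatrix} 2 & 1\\1 & 2\end{pmatrix}\) from Thursday's example, with eigenvalues 1, 3 and eigenvectors \(v_1 = (1,-1)\), \(v_2 = (1,1)\). The sketch below (an illustrative aside, not from class) solves \(P\begin{pmatrix} c_1\\c_2\end{pmatrix} = \begin{pmatrix} x_1(0)\\x_2(0)\end{pmatrix}\) for \(c_1, c_2\) and verifies that the resulting formula satisfies the system:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v1 = np.array([1.0, -1.0])  # eigenvector for eigenvalue 1
v2 = np.array([1.0, 1.0])   # eigenvector for eigenvalue 3
P = np.column_stack([v1, v2])

x0 = np.array([3.0, -4.0])       # initial conditions x1(0) = 3, x2(0) = -4
c1, c2 = np.linalg.solve(P, x0)  # solves P (c1, c2) = x(0)
print(c1, c2)  # c1 = 7/2, c2 = -1/2

def x(t):
    """The proposed solution c1 e^t v1 + c2 e^{3t} v2."""
    return c1 * np.exp(t) * v1 + c2 * np.exp(3 * t) * v2

def x_prime(t):
    """Its derivative, computed analytically."""
    return c1 * np.exp(t) * v1 + 3 * c2 * np.exp(3 * t) * v2

# The formula satisfies x'(t) = A x(t) and the initial conditions.
for t in (0.0, 0.5, 1.0):
    assert np.allclose(x_prime(t), A @ x(t))
assert np.allclose(x(0.0), x0)
```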